Abstract



The celebrated dimension reduction lemma of Johnson and Lindenstrauss has numerous computational
and other applications. Due to its application in practice, speeding up the computation of a
Johnson-Lindenstrauss style dimension reduction is an important question. Recently, Dasgupta, Kumar,
and Sarlos (STOC 2010) constructed such a transform that uses a sparse matrix. This is motivated by
the desire to speed up the computation when applied to sparse input vectors, a scenario that comes up in
applications. The sparsity of their construction was further improved by Kane and Nelson (arXiv 2010).

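We improve the previous bound on the number of non-zero entries per column of Kane and Nelson
from $O\left(\frac{1}{\epsilon}\log(1/\delta)\log(k/\delta)\right)$ (where the target dimension is $k$, the distortion is $1 \pm \epsilon$, and the failure
probability is $\delta$) to

$$O\left(\frac{1}{\epsilon}\left(\frac{\log(1/\delta)\log\log\log(1/\delta)}{\log\log(1/\delta)}\right)^{2}\right).$$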

We also improve the amount of randomness needed to generate the matrix. Our results are obtained 
by connecting the moments of an order 2 Rademacher chaos to the combinatorial properties of random 
Eulerian multigraphs. Estimating the chance that a random multigraph is composed of a given number 
of node-disjoint Eulerian components leads to a new tail bound on the chaos. Our estimates may be of 
independent interest, and as this part of the argument is decoupled from the analysis of the coefficients 
of the chaos, we believe that our methods can be useful in the analysis of other chaoses. 
1 Introduction



The celebrated flattening lemma of Johnson and Lindenstrauss [13] has numerous applications in pure
mathematics, data analysis, signal processing, computational linear algebra, and machine learning. Informally,
the lemma states that a random linear transformation mapping $\mathbb{R}^d$ to $\mathbb{R}^k$, where $k = O(\epsilon^{-2}\log(1/\delta))$,
preserves the $L_2$-norm of any $x \in \mathbb{R}^d$ up to a factor of $(1 \pm \epsilon)$ with probability at least $1 - \delta$. The original
argument uses a projection onto a random linear subspace. However, it turns out that many simpler
transformations work just as well [10, 9, 12, 1, 17]. In particular, a $k \times d$ matrix of $\{-1, 0, +1\}$ i.i.d. entries,
and in fact any sub-Gaussian i.i.d. entries, works [1, 17]. What makes the lemma particularly useful are its
linearity and the fact that the target dimension $k$ depends only on $\epsilon$ and $\delta$ but not on $d$. Alon lU gave a lower
bound on $k$ demonstrating that the above upper bound is nearly the best possible.

Due to its application in practice, speeding up the computation of a Johnson-Lindenstrauss style dimension
reduction beyond the trivial $O(dk)$ arithmetic operations per vector is an important question. Achlioptas
[1], then Matousek [17], gained constant factors by using a sparse matrix. In their groundbreaking
work, Ailon and Chazelle [2] designed a fast Johnson-Lindenstrauss transform (FJLT) that asymptotically
beats the $O(dk)$ bound. Their approach of first applying a preconditioner that "smears" input vectors to
some extent, then using a structured linear transformation that works well on smeared vectors, is prevalent
in follow-up work. Ailon and Liberty [3] gave a better FJLT, whose running time is $O(d \log k)$ arithmetic
operations per input vector. Further results in this vein were given in [16, 3].

Recently, Dasgupta, Kumar, and Sarlos [8] revisited the question of designing a sparse JL transform.
This is motivated by the desire to speed up the computation when applied using a small $\epsilon$ to sparse input
vectors, a scenario that comes up in applications. They construct a random $k \times d$ transformation matrix with
$c = O\left(\frac{1}{\epsilon}\log(1/\delta)\log^2(k/\delta)\right)$ non-zero entries per column. They use a trivial deterministic preconditioner
$P$ that duplicates each coordinate $c$ times and rescales. The choice of $c$ governs the sparsity of the matrix.
The novelty of their approach lies in the construction of the projection matrix, whose entries are not inde- 
pendent. This allows them to overcome a lower bound of Q.{e^'^) on the sparsity of a JL transform matrix 
with independent entries [17]. They construct the projection matrix as follows: pick $\zeta \in \{-1,1\}^{cd}$ and a
hash function $h : [cd] \to [k]$ uniformly at random. The $k \times cd$ projection matrix $H$ has $H_{ij} = \zeta_j \mathbb{1}_{\{h(j) = i\}}$.
Notice that $H$ has a single non-zero entry per column, and the entire transformation $HP$ has $c$ non-zero
entries per column. Kane and Nelson [14] improve the analysis of this scheme. They show that taking
$c = O\left(\frac{1}{\epsilon}\log(1/\delta)\log(k/\delta)\right)$ is sufficient.
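The construction just described can be sketched in a few lines. The following is a minimal illustration, not the authors' implementation: the function names are hypothetical, indices are 0-based, and the hash and signs are drawn fully at random from Python's `random` module rather than from a limited-independence family.

```python
import math
import random

def make_sparse_jl(d, k, c, seed=0):
    """Draw a hash h: [c*d] -> [k] and signs zeta in {-1,+1}^(c*d).

    Assumption: fully random draws; the paper only needs limited independence.
    """
    rng = random.Random(seed)
    h = [rng.randrange(k) for _ in range(c * d)]
    zeta = [rng.choice((-1, 1)) for _ in range(c * d)]
    return h, zeta

def apply_sparse_jl(x, h, zeta, k, c):
    """Compute y = H P x, visiting only the non-zero coordinates of x."""
    y = [0.0] * k
    scale = 1.0 / math.sqrt(c)      # P duplicates each coordinate c times and rescales
    for i, xi in enumerate(x):
        if xi == 0.0:
            continue                # work is proportional to c times the number of non-zeros
        for r in range(c):
            j = i * c + r           # index of the r-th copy of coordinate i
            y[h[j]] += zeta[j] * scale * xi
    return y
```

Applying the map to a standard basis vector touches at most `c` output coordinates, which is the "c non-zero entries per column" property of $HP$.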

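We provide an alternative, tighter analysis of this scheme and show that it is sufficient to set

$$c = O\left(\frac{1}{\epsilon}\left(\frac{\log(1/\delta)\log\log\log(1/\delta)}{\log\log(1/\delta)}\right)^{2}\right).$$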

In both previous papers, as well as this work, the starting point is the same: the argument boils down to
analyzing the distribution of an order 2 Rademacher chaos $Z = \sum_{1 \le i < j \le d} a_{ij}\zeta_i\zeta_j$, where the coefficients
$a_{ij}$ are derived from the hash function $h$ and the projected vector $x$. In particular, showing that the transform
works for a particular choice of $c$ boils down to proving a tail inequality bounding the probability that $Z$
deviates from 0. We prove such a tail inequality by bounding a judiciously chosen large even moment of $Z$.

Notice that the monomials in the expansion of $E[Z^{2m}]$ are (sums of) products of terms in the sum
defining $Z$. As each term involves two indices there is a correspondence between monomials and
graphs on $\{1, 2, \ldots, d\}$. The non-zero monomials correspond to graphs where all nodes have even degree,
in other words: unions of node-disjoint Eulerian graphs. The previous papers resorted to existing measure
concentration inequalities. They implicitly related the moments to the weight of a subset of the monomials
where the graphs are composed of pairs of parallel edges, and thus used the combinatorial structure only
partially. This approach seems to hit a barrier when $c = o\left(\frac{1}{\epsilon}\log^2(1/\delta)\right)$.
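The correspondence between non-zero monomials and even-degree graphs can be checked by brute force on tiny instances. The sketch below is illustrative only (function names are hypothetical and vertices are 0-indexed): it computes the exact expectation of a product of sign pairs over uniform signs and compares it with the parity of the degrees.

```python
from collections import Counter
from itertools import product

def expectation_of_monomial(edges, d):
    """Exact E[prod over edges (a,b) of zeta_a * zeta_b] for uniform zeta in {-1,+1}^d."""
    total = 0
    for zeta in product((-1, 1), repeat=d):
        term = 1
        for a, b in edges:
            term *= zeta[a] * zeta[b]
        total += term
    return total / 2 ** d

def all_degrees_even(edges):
    """True when every vertex touched by the edge multiset has even degree."""
    deg = Counter()
    for a, b in edges:
        deg[a] += 1
        deg[b] += 1
    return all(v % 2 == 0 for v in deg.values())
```

For example, a triangle or a pair of parallel edges survives the expectation, while a path of two edges does not.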

In order to overcome this barrier, we fully exploit the combinatorial structure of the monomial terms
in the expansion of $E[Z^{2m}]$. In particular, we prove non-trivial bounds on the probability that a random
multigraph is the union of a given number of disjoint Eulerian components (the difficulty stems from the
fact that this is not a monotone property). These bounds may be of independent interest. Moreover, our
analysis of the combinatorial structure of the monomials is decoupled from the use of the specific properties
of the coefficients of the chaos that lead to the specific tail inequality that we get. Therefore, our methods
are likely to be useful in the analysis of other order 2 Rademacher chaoses.

Kane and Nelson [14] also reduce the required amount of randomness, as compared to the original
construction of [8]. Our analysis further slightly improves the bound on the randomness needed. We
need $O(\log(1/\delta))$-wise independent vectors, whereas Kane and Nelson use $O(\log(k/\delta))$-wise independent
vectors.

Definitions, Assumptions and Main Results 

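Let $0 < \delta, \epsilon < 1$ be two parameters. We assume that $\epsilon < \log^{-2}(\delta^{-1})$. Define $m = O(\log \delta^{-1})$ and
$k = O(\epsilon^{-2} m)$. Define $C = O\left(\epsilon^{-1}\left(\frac{m}{F(m)}\right)^{2}\right)$ for a function $F$ such that $F(m) = O\left(\frac{\log m}{\log\log m}\right)$. Let
$H : [d] \to [k]$ be a random function and let $\zeta \in \{-1, 1\}^d$ be a random vector. Both vectors have $O(m)$-wise
independent entries. Let $x \in \mathbb{R}^d$ be a fixed vector such that $\|x\|_2 = 1$ and $\|x\|_\infty \le C^{-1/2}$. Define

$$Z_t = \sum_{i \ne j \in [d]} x_i x_j \zeta_i \zeta_j \mathbb{1}_{\{H(i)=H(j)=t\}}, \qquad Z = \sum_{t=1}^{k} Z_t.$$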
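To see why $Z$ is the quantity of interest, note that hashing each coordinate $i$ into bucket $H(i)$ with sign $\zeta_i$ yields a vector whose squared norm equals $\|x\|_2^2 + Z$. The sketch below checks this identity with integer arithmetic; the function names are hypothetical, buckets are 0-indexed, and $x$ is not normalized so that the arithmetic stays exact.

```python
def chaos_Z(x, H, zeta, k):
    """Z = sum_t Z_t, with Z_t = sum over i != j of x_i x_j zeta_i zeta_j 1{H(i)=H(j)=t}."""
    d = len(x)
    Z = 0
    for t in range(k):
        for i in range(d):
            for j in range(d):
                if i != j and H[i] == t and H[j] == t:
                    Z += x[i] * x[j] * zeta[i] * zeta[j]
    return Z

def squared_norm_after_hashing(x, H, zeta, k):
    """Squared L2 norm of y, where y_t = sum of zeta_i * x_i over all i with H(i) = t."""
    y = [0] * k
    for i, xi in enumerate(x):
        y[H[i]] += zeta[i] * xi
    return sum(v * v for v in y)
```

With $\|x\|_2 = 1$ the identity reads $\|y\|_2^2 = 1 + Z$, so bounding $|Z| \le \epsilon$ controls the distortion of the squared norm.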

We note that for fixed $H$ each variable $Z_t$ can be seen as a particular case of Rademacher chaos. Rademacher
chaos of order 2 is defined as a random variable of the form $\sum_{i \ne j} a_{ij}\zeta_i\zeta_j$. Thus, we consider a special
case when $a_{ij} = x_i x_j \mathbb{1}_{\{H(i)=H(j)=t\}}$. There are many bounds for Rademacher chaos, such as Bonami inequality Q and
others, see e.g., Blei and Janson, Hanson and Wright [11], Latala [15]. In particular, they can be applied
to each $Z_t$ for fixed $H$. However, there are two issues with applying a general inequality in our setting.
First, we might lose precision when applying it directly to $Z$, a random sum (defined by $H$) of Rademacher
chaoses. Second, we can employ the structure of $a_{ij}$ to achieve better bounds. Our main technical result
is a new tail probability inequality for a random sum of Rademacher chaoses of the special form as above.
In particular we prove:

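Theorem 1.1. There exists an absolute constant $\alpha$ such that if $C \ge \alpha\epsilon^{-1}\left(\frac{m}{F(m)}\right)^{2}$ and $k \ge \alpha\epsilon^{-2}m$ then:
$E(Z^{2m}) \le (0.1\epsilon)^{2m}$. Further, there exists an absolute constant $\gamma$ such that

$$P(|Z| \ge \epsilon) \le \gamma\delta.$$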

Thus, we give an improvement to Theorem 2 from [8] and Theorem 10 from [14]. It is important to
emphasize the difference between our approach and that of [8, 14]. Both previous works first bound each $Z_t$ using
known tail bounds for a Rademacher chaos and then take a union bound over all $Z_t$, $1 \le t \le k$,
in order to upper bound $Z$. We, in contrast, provide a new tail inequality.

Next, we note that Theorem 1.1 immediately implies the following, by repeating the arguments from [8]:
Theorem 1.2. There exists a universal constant $\gamma$ and a distribution $\mathcal{D}$ over $k \times d$ matrices with real-valued
elements such that if $M \sim \mathcal{D}$ then for any fixed $x \in \mathbb{R}^d$ the following is true. First,

$$P\left((1 - \epsilon)\|x\|_2 \le \|Mx\|_2 \le (1 + \epsilon)\|x\|_2\right) \ge 1 - \gamma\delta.$$

Second, $Mx$ can be computed in time

$$O\left(\frac{1}{\epsilon}\left(\frac{\log(1/\delta)\log\log\log(1/\delta)}{\log\log(1/\delta)}\right)^{2}\|x\|_0\right).$$

Third, $M$ can be constructed using vectors with $O(\log(1/\delta))$-wise independent entries.
1.1 An Informal Explanation 

We take a direct approach to the above problem and try to estimate the moments of $Z$ directly. That is, we
write

$$E(Z^{2m}) = E\left(\left(\sum_{t=1}^{k}\sum_{i \ne j \in [d]} x_i x_j \zeta_i\zeta_j \mathbb{1}_{\{H(i)=H(j)=t\}}\right)^{2m}\right).$$

Further, $Z^{2m}$ can be seen as a sum of all possible monomials which can be constructed from $2m$ elements
of the form $x_i x_j \zeta_i\zeta_j \mathbb{1}_{\{H(i)=H(j)=t\}}$. Thus, we group the terms according to certain criteria and
estimate the expectation of the terms inside each group differently. It turns out that each monomial with positive
expectation corresponds to a multigraph with positive and even degrees. The expectation depends on the
number of connected components of such graphs. That is, we reduce the problem of estimating moments of
$Z$ to the question of how many multigraphs can be constructed for a given subset of vertices $\{1, 2, \ldots, i\}$
and a given number of connected components $t$. It is not hard to see that $t \le i/2$ for graphs with even
degrees. Also, note that there is a direct upper bound on the number of such sequences, namely $i^{4m}$.

Informally, we employ the following intuitive fact. If the multigraph has a small number of connected
components, then the total probability of such a graph is very small. On the other hand, if there are many
connected components, then the graph should be sparse with $o(i^2)$ edges and thus better bounds are
possible. The main technical work is to prove that for any number of components, the combined influence of
probability and sparsity in fact gives the required bound.

2 Reduction to Graphs

Let $S$ be a sequence of pairs $S = (S_1, \ldots, S_{2m})$ where $S_i = (S_{i,1}, S_{i,2})$ such that $1 \le S_{i,1} < S_{i,2} \le d$.
Define $A$ to be the set of all such sequences. Define a random variable

$$R_S = \prod_{i=1}^{2m}\left(x_{S_{i,1}} x_{S_{i,2}} \zeta_{S_{i,1}} \zeta_{S_{i,2}} \left(\sum_{t=1}^{k} \mathbb{1}_{\{H(S_{i,1})=H(S_{i,2})=t\}}\right)\right).$$

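Fact 2.1. $E(Z^{2m}) = 2^{2m}\sum_{S \in A} E(R_S)$.

Proof. We can rewrite:

$$Z = \sum_{t=1}^{k} Z_t = 2\sum_{1 \le i < j \le d} x_i x_j \zeta_i\zeta_j \left(\sum_{t=1}^{k}\mathbb{1}_{\{H(i)=H(j)=t\}}\right), \qquad \text{so} \qquad Z^{2m} = 2^{2m}\sum_{S \in A} R_S.$$

The fact follows. □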
Definition 2.2. Let $G$ be an undirected connected multigraph with $G = (V, E)$ and $V \subseteq [d]$. Define
$\mathrm{WEIGHT}(G) = 0$ if $G$ has at least one vertex with an odd degree and otherwise define

$$\mathrm{WEIGHT}(G) = \frac{1}{k^{|V|-1}}\prod_{v \in V} x_v^{\deg(v)}.$$

Let $G$ be an undirected multigraph and let $G_1, \ldots, G_t$ be the connected components of $G$. Define

$$\mathrm{WEIGHT}(G) = \prod_{i=1}^{t} \mathrm{WEIGHT}(G_i).$$



Definition 2.3. Let $V \subseteq [d]$. Define

$$\mathrm{SQUARES}(V) = \prod_{v \in V} x_v^2.$$

Definition 2.4. Let $S \in A$. Define $G(S)$ to be the following undirected multigraph. Vertices of the graph
are the numbers that appear in the sequence $S$. That is, the set of vertices of $G(S)$ is $\{v \in [d] : \exists i \in
[2m], j \in \{1, 2\}, S_{i,j} = v\}$. The multiset of edges of $G(S)$ consists of all edges of the form $(S_{i,1}, S_{i,2})$.

Definition 2.5. Let $G$ be a multigraph with vertices in $[d]$. Define $\mathrm{Ver}(G)$ to be the set of all vertices of $G$
with positive degree. Define $\mathrm{Edg}(G)$ to be the multiset of all edges of $G$.

Lemma 2.6.

$$E(R_S) = \mathrm{WEIGHT}(G(S)).$$

Proof. Definition 2.4 implies that all vertices of $G(S)$ have positive degree. It follows that $G(S)$ has a
vertex $v$ with an odd degree if and only if $x_v$ has an odd degree in $R_S$. In this case we can write $R_S$ as $\zeta_v L$
where $L$ is independent of $\zeta_v$ and thus $E(R_S) = 0 = \mathrm{WEIGHT}(G(S))$.

Consider the case when $G(S)$ has only vertices with positive and even degree. First, let us assume that
$G(S)$ is connected, and write $V = \mathrm{Ver}(G(S))$. Then

$$E(R_S) = \prod_{v \in V} x_v^{\deg(v)}\, E\left(\prod_{i=1}^{2m}\left[\sum_{t=1}^{k}\mathbb{1}_{\{H(S_{i,1})=H(S_{i,2})=t\}}\right]\right).$$

Since $G(S)$ is connected we have

$$E\left(\prod_{i=1}^{2m}\left(\sum_{t=1}^{k}\mathbb{1}_{\{H(S_{i,1})=H(S_{i,2})=t\}}\right)\right) = E\left(\sum_{t=1}^{k}\prod_{v \in V}\mathbb{1}_{\{H(v)=t\}}\right) = \frac{1}{k^{|V|-1}}.$$

The case when $G$ has more than one connected component is proven by repeating the above arguments
for each connected component and by noting that the random variables that correspond to components are
independent. □
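Lemma 2.6 can be verified exactly on tiny instances by enumerating every hash function and sign vector. The following sketch is illustrative only: names are hypothetical, vertices and buckets are 0-indexed, $H$ and $\zeta$ are taken fully independent and uniform (an assumption, stronger than the limited independence used in the text), and $x$ is an arbitrary rational vector so that all arithmetic is exact.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

def expected_R(S, x, k):
    """Exact E(R_S), enumerating all H in [k]^d and all zeta in {-1,+1}^d."""
    d = len(x)
    total = Fraction(0)
    for H in product(range(k), repeat=d):
        for zeta in product((-1, 1), repeat=d):
            term = Fraction(1)
            for a, b in S:
                # sum_t 1{H(a)=H(b)=t} is 1 exactly when a and b share a bucket
                collide = 1 if H[a] == H[b] else 0
                term *= x[a] * x[b] * zeta[a] * zeta[b] * collide
            total += term
    return total / (k ** d * 2 ** d)

def weight(S, x, k):
    """WEIGHT(G(S)): 0 if some degree is odd, else k^-(i - t) * prod_v x_v^deg(v)."""
    deg = Counter(v for edge in S for v in edge)
    if any(dv % 2 for dv in deg.values()):
        return Fraction(0)
    parent = {v: v for v in deg}          # union-find over the vertices of G(S)
    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v
    for a, b in S:
        parent[find(a)] = find(b)
    t = len({find(v) for v in deg})       # number of connected components
    w = Fraction(1, k ** (len(deg) - t))
    for v, dv in deg.items():
        w *= x[v] ** dv
    return w
```

The two functions agree on a 4-cycle, on two disjoint double edges, and on sequences with an odd-degree vertex (where both are zero).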

Definition 2.7. For $Q \subseteq [d]$ and $t$ define $W_{Q,t}$ to be the set of all sequences $S$ such that $\mathrm{Ver}(G(S)) = Q$,
such that $G(S)$ has $t$ connected components and such that all degrees in $G(S)$ are positive and even. By
symmetry, for any $Q, Q'$ such that $|Q| = |Q'|$ we have $|W_{Q,t}| = |W_{Q',t}|$.
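Lemma 2.8. Let $S \in W_{Q,t}$ for some $Q \subseteq [d]$ with $|Q| = i$. Then

$$\mathrm{WEIGHT}(G(S)) \le \frac{1}{k^{i-t}C^{2m-i}}\,\mathrm{SQUARES}(\mathrm{Ver}(G(S))).$$

Proof. By Definition 2.2,

$$\mathrm{WEIGHT}(G(S)) = \frac{1}{k^{i-t}}\prod_{v \in \mathrm{Ver}(G(S))} x_v^{\deg(v)}.$$

Next, note the following. For every $v$ it is true that $\deg(v) \ge 2$ and $|x_v| \le C^{-1/2}$. Also, $\sum_{v \in V} \deg(v) = 4m$.
Thus, we conclude:

$$\mathrm{WEIGHT}(G(S)) \le \frac{1}{k^{i-t}}\prod_{v \in \mathrm{Ver}(G(S))} x_v^2\, C^{-(\deg(v)-2)/2} = \frac{1}{k^{i-t}C^{2m-i}}\,\mathrm{SQUARES}(\mathrm{Ver}(G(S))).$$

□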

Fact 2.9. Let $S \notin \bigcup_{i=1}^{2m}\bigcup_{t=1}^{i/2}\bigcup_{Q \subseteq [d], |Q|=i} W_{Q,t}$. Then $E(R_S) = 0$.

Proof. Consider $S \notin \bigcup_{i=1}^{2m}\bigcup_{t=1}^{i}\bigcup_{Q \subseteq [d], |Q|=i} W_{Q,t}$. Then $G(S)$ has at least one node of odd degree. It follows that
$E(R_S) = 0$.

Further, we show that $W_{[i],t} = \emptyset$ for $t > i/2$. Indeed, consider $S \in W_{[i],t}$ with $t > i/2$. It follows that at least one
of the connected components of $G(S)$ has exactly one node. This contradicts the definition of sequences $S$.
Thus, $W_{[i],t} = \emptyset$ and the fact follows. □
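For very small parameters the sets $W_{[i],t}$ can be enumerated directly, which makes the constraint $t \le i/2$ and the crude bound $|W_{[i],t}| \le i^{4m}$ concrete. The sketch below is a brute-force illustration with hypothetical function names and 0-indexed vertices; it is only practical for tiny $i$ and $m$.

```python
from collections import Counter
from itertools import combinations, product

def num_components(S, i):
    """Number of connected components of the multigraph on range(i) with edge list S."""
    parent = list(range(i))
    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v
    for a, b in S:
        parent[find(a)] = find(b)
    return len({find(v) for v in range(i)})

def count_W(i, t, m):
    """Brute-force |W_{[i],t}|: sequences of 2m pairs a < b over range(i) whose graph
    uses every vertex, has only even degrees, and has exactly t components."""
    pairs = list(combinations(range(i), 2))
    count = 0
    for S in product(pairs, repeat=2 * m):
        deg = Counter(v for edge in S for v in edge)
        if len(deg) != i or any(dv % 2 for dv in deg.values()):
            continue
        if num_components(S, i) == t:
            count += 1
    return count
```

Running `count_W(i, t, 2)` for $i \le 4$ returns zero whenever $t > i/2$, in line with Fact 2.9.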



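Lemma 2.10.

$$E(Z^{2m}) \le 2^{2m}\sum_{i=1}^{2m}\sum_{t=1}^{i/2}\frac{|W_{[i],t}|}{i!\,k^{i-t}C^{2m-i}}.$$

Proof.

$$\begin{aligned}
E(Z^{2m}) &= 2^{2m}\sum_{S \in A} E(R_S) && \text{(by Fact 2.1)}\\
&= 2^{2m}\sum_{i=1}^{2m}\sum_{t=1}^{i/2}\sum_{Q \subseteq [d], |Q|=i}\ \sum_{S \in W_{Q,t}} E(R_S) && \text{(by Fact 2.9)}\\
&= 2^{2m}\sum_{i=1}^{2m}\sum_{t=1}^{i/2}\sum_{Q \subseteq [d], |Q|=i}\ \sum_{S \in W_{Q,t}} \mathrm{WEIGHT}(G(S)) && \text{(by Lemma 2.6)}\\
&\le 2^{2m}\sum_{i=1}^{2m}\sum_{t=1}^{i/2}\sum_{Q \subseteq [d], |Q|=i}\ \sum_{S \in W_{Q,t}} \frac{1}{k^{i-t}C^{2m-i}}\,\mathrm{SQUARES}(\mathrm{Ver}(G(S))) && \text{(by Lemma 2.8)}\\
&\le 2^{2m}\sum_{i=1}^{2m}\sum_{t=1}^{i/2}\frac{|W_{[i],t}|}{k^{i-t}C^{2m-i}}\sum_{Q \subseteq [d], |Q|=i}\mathrm{SQUARES}(Q)\\
&\le 2^{2m}\sum_{i=1}^{2m}\sum_{t=1}^{i/2}\frac{|W_{[i],t}|}{i!\,k^{i-t}C^{2m-i}}\left(\sum_{j \in [d]} x_j^2\right)^{i}\\
&= 2^{2m}\sum_{i=1}^{2m}\sum_{t=1}^{i/2}\frac{|W_{[i],t}|}{i!\,k^{i-t}C^{2m-i}}.
\end{aligned}$$

□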



2.1 Proof of Theorem 1.1

Proof. Let $\epsilon < m^{-2}$. Then by Lemma 2.12 and by Fact 2.11 there exists an absolute constant $\alpha$ such that
if $C \ge \alpha\epsilon^{-1}\left(\frac{m}{F(m)}\right)^{2}$ and $k \ge \alpha\epsilon^{-2}m$ then the following is true. For any $1 \le i \le 2m$ and for any
$1 \le t \le i/2$:

$$\frac{|W_{[i],t}|}{i!\,k^{i-t}C^{2m-i}} \le (0.01\epsilon)^{2m}.$$

Thus, by Lemma 2.10 for sufficiently large $m$:

$$E(Z^{2m}) \le \epsilon^{2m} m^2 (0.02)^{2m} \le (0.1\epsilon)^{2m}.$$

To show the second claim, note that $P(|Z| \ge \epsilon) \le P(Z^{2m} \ge \epsilon^{2m})$. Also, recall that $m = O(\log(1/\delta))$.
Since $Z^{2m}$ is a non-negative random variable, the second claim follows from Markov's inequality and the first
claim of the theorem. □
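Written out, the Markov step is a one-line calculation. The concrete threshold on $m$ in the comment below is an illustrative assumption (the text only states $m = O(\log(1/\delta))$); any $m$ that is a sufficiently large constant multiple of $\log(1/\delta)$ works the same way.

```latex
\begin{align*}
P(|Z| \ge \epsilon)
  &= P\left(Z^{2m} \ge \epsilon^{2m}\right)
   \le \frac{E(Z^{2m})}{\epsilon^{2m}}
   \le \frac{(0.1\epsilon)^{2m}}{\epsilon^{2m}}
   = 10^{-2m}.
\end{align*}
% Illustrative assumption: if m >= (1/2) \log_{10}(1/\delta), then 10^{-2m} <= \delta,
% which gives the second claim of Theorem 1.1 with \gamma = 1.
```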

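Fact 2.11.

$$k^{i-t}C^{2m-i} \ge \alpha^{2m-t}\,\frac{1}{\epsilon^{2m}}\, m^{4m+i-5t}\left(\frac{1}{F(m)}\right)^{4m-2i}.$$

Proof. Recall that $\epsilon < m^{-2}$ and that $t \le i/2 \le m$.

$$\begin{aligned}
k^{i-t}C^{2m-i} &\ge (\alpha\epsilon^{-2}m)^{i-t}\left(\alpha\epsilon^{-1}\left(\frac{m}{F(m)}\right)^{2}\right)^{2m-i}\\
&= \alpha^{2m-t}\,\frac{1}{\epsilon^{2m}\epsilon^{i-2t}}\, m^{4m-i-t}\left(\frac{1}{F(m)}\right)^{4m-2i}\\
&\ge \alpha^{2m-t}\,\frac{1}{\epsilon^{2m}}\, m^{4m+i-5t}\left(\frac{1}{F(m)}\right)^{4m-2i}.
\end{aligned}$$

□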



In the remainder of our paper we prove the following main technical lemma.

Lemma 2.12. Let $\epsilon < m^{-2}$. There exists an absolute constant $\mathrm{CONST} = O(1)$ such that for any $1 \le i \le
2m$ and for any $1 \le t \le i/2$:

$$|W_{[i],t}| \le (\mathrm{CONST})^{2m}\, i!\, m^{4m+i-5t}\left(\frac{1}{F(m)}\right)^{4m-2i}.$$

Proof. The lemma follows directly from Lemma 3.2, Lemma 3.10 and Lemma 3.12. □
3 Bounding $|W_{[i],t}|$

Fact 3.1. There exists a constant $v$ such that for $F(m) \le v\frac{\log m}{\log\log m}$ and for any $x > 0$:

Proof. Follows from the fact that for small constant $v$,

$$F(m)\log(F(m)) \le 0.01\log(m).$$

□ 

3.1 Small t 

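Lemma 3.2. Let $t \le 0.39i$ and $\epsilon < m^{-2}$. There exists an absolute constant $\mathrm{CONST} = O(1)$ such that for
any $1 \le i \le 2m$ and for any $1 \le t \le i/2$:

$$|W_{[i],t}| \le (\mathrm{CONST})^{2m}\, i!\, m^{4m+i-5t}\left(\frac{1}{F(m)}\right)^{4m-2i}.$$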



Proof. It follows from the definition of $W_{[i],t}$ that $|W_{[i],t}| \le i^{4m}$. Also note that $i \le 2m$. Also, since $t \le 0.39i$, $2i - 5t \ge
0.05i$. Thus, there exists a constant $\varphi$ such that,

•4m ^ -Am-i ^(^)4m-2i 



4m— 2i — ^ jj^im—i ^O.OSi 



First, consider the case when i < Then the lemma follows immediately. Otherwise, for suffi- 

ciently large m and for some constant il;: 



I F(m) 



4m-2i — ^ j^O.OSi ' 



The lemma follows from Fact 3.1. □
3.2 Some Facts 

Fact 3.3. Let $t$ be such that $3t > i$ and let $S \in W_{[i],t}$. Then $G(S)$ has at least $(3t - i) > 0$ components with
size exactly 2.

Proof. Each component must have at least 2 nodes. Thus, there are at most $(i - 2t)$ components with more
than 2 nodes. Thus, there are at least $t - (i - 2t) = 3t - i$ components of size exactly 2. □
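The counting behind Fact 3.3 depends only on the component sizes, so it can be checked by enumerating all ways of splitting $i$ vertices into $t$ components of size at least 2. The sketch below is illustrative; the function names are hypothetical.

```python
def partitions_min2(i, t, smallest=2):
    """Yield non-decreasing tuples of t integers, each >= smallest, that sum to i."""
    if t == 0:
        if i == 0:
            yield ()
        return
    # the first part is the smallest one, so first * t cannot exceed i
    for first in range(smallest, i // t + 1):
        for rest in partitions_min2(i - first, t - 1, first):
            yield (first,) + rest

def min_size_two_components(i, t):
    """Fewest components of size exactly 2 over all valid size profiles (None if no profile exists)."""
    counts = [sizes.count(2) for sizes in partitions_min2(i, t)]
    return min(counts) if counts else None
```

For every feasible pair the minimum equals $\max(3t - i, 0)$, so the bound of Fact 3.3 is attained, for instance by $3t - i$ components of size 2 and $i - 2t$ components of size 3.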

Definition 3.4. Define $\mathrm{SPARSE}_u$ as the set of all sequences $S$ such that $\mathrm{Ver}(G(S)) = [i]$, $G(S)$ has at
least $u$ components of size two and such that all vertices of $G(S)$ are of even and positive degree.

Fact 3.5. Let $t$ be such that $3t > i$. Then $W_{[i],t} \subseteq \mathrm{SPARSE}_{3t-i}$.

Proof. Follows directly from Fact 3.3 and the definitions. □
Definition 3.6. Let $Q$ be a set of size $(3t - i)$ of pairs of distinct numbers from $[i]$. That is

$$Q = \bigcup_{j=1}^{3t-i}\{(q_{2j-1}, q_{2j})\}$$

such that $q_j \in [i]$ and $q_j \ne q_{j'}$ for any $j \ne j'$. Let $\mathcal{Q}$ be the set of all such possible $Q$. For $Q \in \mathcal{Q}$,
define $\mathrm{CONCRETE}(Q)$ to be the set of all sequences $S$ such that $G(S)$ has connected components with the
following sets of vertices: $\{q_1, q_2\}, \ldots, \{q_{2(3t-i)-1}, q_{2(3t-i)}\}$.

Fact 3.7.

$$|\mathcal{Q}| \le \binom{i}{2(3t-i)}\frac{(2(3t-i))!}{(3t-i)!}.$$

Fact 3.8.

$$\mathrm{SPARSE}_{3t-i} \subseteq \bigcup_{Q \in \mathcal{Q}} \mathrm{CONCRETE}(Q).$$

3.3 Medium t 

In the remainder of the paper we consider the case¹ when $t > 0.39i$. Denote $z = i - 2t$. In this section we
consider the case when $t$ is not very large such that $z^2 \ge 2(3t - i)$.

Fact 3.9. Let $Q$ be an ordered set of size $(3t - i)$ from Definition 3.6. Then there exists an absolute constant
$\gamma$ such that

$$|\mathrm{CONCRETE}(Q)| \le (2m)^{2(3t-i)}(\gamma z)^{4(m-3t+i)}.$$

Proof. Let $A' = \bigcup_{j=1}^{3t-i}\{(q_{2j-1}, q_{2j})\}$. Let $B = [i] \setminus \{q_1, \ldots, q_{2(3t-i)}\}$ and $B' = A' \cup \left(\bigcup_{j, j' \in B,\, j < j'}\{(j, j')\}\right)$.
If $S \in \mathrm{CONCRETE}(Q)$ then $S \in (B')^{2m}$. Also,

$$|B'| \le 2(3t - i) + (i - 2(3t - i))^2 = 2(3t - i) + (3z)^2 \le 10z^2.$$

Also, each pair $(q_{2j-1}, q_{2j})$ must appear at least twice in the sequence $S$. We count the number of such
sequences as follows. First, we choose the $2(3t - i)$ locations of the appearances for the pairs $(q_{2j-1}, q_{2j})$. For
a fixed set of locations, the number of sequences $S$ that agree on these locations is bounded by $|B'|^{2m-2(3t-i)}$.
The total number of different sets of locations is bounded by $(2m)^{2(3t-i)}$. This is an over-counting, yet it is
sufficient for our goals. Thus, we conclude that there exists an absolute constant $\beta$ such that

$$|\mathrm{CONCRETE}(Q)| \le (2m)^{2(3t-i)}(\beta z)^{4(m-3t+i)}.$$

□

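Lemma 3.10. Let $t > 0.39i$ such that $z^2 \ge 2(3t - i)$. There exists an absolute constant $\mathrm{CONST} = O(1)$
such that for any $1 \le i \le 2m$ and for any $1 \le t \le i/2$:

$$|W_{[i],t}| \le (\mathrm{CONST})^{2m}\, i!\, m^{4m+i-5t}\left(\frac{1}{F(m)}\right)^{4m-2i}.$$

¹ In fact we only need $3t > (1 + \gamma)i$ for some constant $\gamma$.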
Proof. By Fact 3.9, Fact 3.7 and Fact 3.8 there exists an absolute constant $\beta$ such that

$$|W_{[i],t}| \le \binom{i}{2(3t-i)}\frac{(2(3t-i))!}{(3t-i)!}(2m)^{2(3t-i)}(\beta z)^{4(m-3t+i)}.$$

Further,

$$\binom{i}{2(3t-i)}\frac{(2(3t-i))!}{(3t-i)!}(2m)^{2(3t-i)}(\beta z)^{4(m-3t+i)} = \frac{i!}{(3t-i)!\,(3z)!}(2m)^{2(3t-i)}(\beta z)^{4m-12t+4i}.$$

Note that $3t - i > 0.1i$. Thus,



Thus, there exists an absolute constant 7 such that 

To prove the lemma, we need to estimate the following quantity: 

^6t—2i^4m—6t+i-p^^^4m—2i 



j^St—ijj^im+i—Bt 

We show that there exists a constant such that: 

'"' ^ ^ y'"') _ im 

Rewrite: 



We consider the following three cases, there exists a constant il): 



z 



im-6t+ip(^^-^4m-2i (F(m)z)^™~6*+*F(m)-3*+6* 



If i < m and < z < then there exists a constant 7: 



Finally, if max( , y^^) < z then there exists a constant /3: 



^4m-6i+*^(^)4m~2i ^(m)^™ 



< /3" 



jjit—ijYiAm+2i—9tjy^z 



The lemma follows from Fact 3.1. □



^We stress that this claim is correct for any 1 < i < 2m. 



3.4 Large t 

In this section we consider $t$ such that $z^2 < 2(3t - i)$. The proof of the following fact is identical to that of Fact
3.9 if we note that $z^2 < 2(3t - i) \le i$.



Fact 3.11. Let $Q$ be an ordered set of size $2(3t - i)$ from Definition 3.6. Then there exists an absolute
constant $\beta$ such that:

$$|\mathrm{CONCRETE}(Q)| \le (2m)^{2(3t-i)}(\beta i)^{2(m-3t+i)}.$$

Lemma 3.12. Let $t$ be such that $z^2 < 2(3t - i)$. There exists an absolute constant $\mathrm{CONST} = O(1)$ such
that for any $1 \le i \le 2m$ and for any $1 \le t \le i/2$:

$$|W_{[i],t}| \le (\mathrm{CONST})^{2m}\, i!\, m^{4m+i-5t}\left(\frac{1}{F(m)}\right)^{4m-2i}.$$



Proof. By Fact 3.11, Fact 3.7 and Fact 3.8 there exists an absolute constant $\beta$ such that:
Further, there exists a constant $\gamma$:

I \ (^(^i^ll)):/2^)2{3i-i)^2{m-3t+j) < m-,/2^)2(3t-i)-2m-9t+3i 

2(3t-i)y {3t-iy. ' - I -y J 

Thus, 



4m-2j 



. < Y"-^ < 



•2m-9^+3^ rpr^ \4'm—2i p/^\4m-22 



for a constant $\psi$. □